Two  sets  of  prototype  screens  for  a  complex,  computerized  analysis  tool
were  evaluated  using  three  usability  analysis  techniques.  The  experimental 
usability  method  identified  more  interface  design  problems  of  a  severe 
nature  than  the  other  methods  did  and  gave  a  clear  indication  of  which 
prototype  design  to  choose  for  the  final  development  process.  The 
implications  for  selecting  appropriate  usability  techniques  and  using  them 
collectively,  as  a  process,  are  discussed. 
EXECUTIVE  SUMMARY


The  Hardware  versus  Manpower  III  (HARDMAN  III)  suite  of  personal  computer  (PC)- 
based  analysis  tools  was  developed  to  operate  using  an  International  Business  Machines  (IBM)- 
compatible  machine  with  a  286  processor  and  the  Microsoft  disk  operating  system  (MS- 
DOS™).  MS-DOS™  is  a  command-line,  text-based  operating  system.  Since  the  development  of 
HARDMAN  III,  software  companies  have  developed  graphical  user  interfaces  for  use  on  IBM- 
compatible  machines.  To  take  advantage  of  the  developments  in  software  technology,  the  next 
version  of  HARDMAN  III  (which  will  be  called  Improved  Performance  Research  Integration 
Tool  [IMPRINT])  will  incorporate  the  use  of  a  graphical  user  interface  under  the  Windows™ 
operating  system.  To  facilitate  an  efficient  transition  of  the  DOS-based  version  of  HARDMAN 
III  to  the  Windows™  version  of  HARDMAN  III  (IMPRINT),  a  usability  study  was  conducted
on  two  computer  prototypes  that  represented  two  graphical  user  interface  designs  for 
IMPRINT. 

Three  different  usability  analysis  techniques  were  used  to  evaluate  the  two  candidate 
interfaces  for  IMPRINT:  an  experimental  evaluation,  an  individual  heuristic  evaluation,  and  a 
group  walk-through  evaluation.  The  experimental  comparison  of  the  two  computer  prototype 
designs  was  used  to  select  a  final  design  for  development.  The  study  incorporated  a  variety  of 
usability  analysis  techniques  in  an  experimental  setting.  Comparisons  of  these  techniques  were 
done  to  assess  the  overall  effectiveness  of  each  technique. 

Results  from  the  experimental  analysis  provided  a  clear  indication  of  a  difference  between 
the  two  prototypes  and  therefore  indicated  a  clear  choice  for  final  development.  Results  also 
indicated  that  task  selection  was  a  critical  component  for  the  experimental  analysis  technique. 
Results  indicated  that  task  times  and  error  data  were  significantly  different  for  the  two  separate 
sets  of  ten  tasks. 

The  findings  of  this  study  also  showed  that  different  types  of  usability  analysis 
techniques  found  different  types  of  errors.  It  is  therefore  recommended  that  a  series  or  group  of 
usability  analysis  techniques  be  used  for  any  interface  design  evaluation,  instead  of  using  a  single 
evaluation  technique. 
USABILITY  TESTING  FOR  THE  IMPROVED  PERFORMANCE  RESEARCH
INTEGRATION  TOOL  (IMPRINT) 


INTRODUCTION 

Comparisons of usability methodologies have recently appeared in the literature. Most
articles have emphasized the relative cost and effectiveness of each usability technique
(Karat, Campbell, & Fiegel, 1992; Virzi, Sorce, & Herbert, 1993; Jeffries, Miller, Wharton, &
Uyeda, 1991). Questions addressed by this research include “How effective is one particular
usability technique compared with another?” “How much does one technique cost in comparison
to other usability analysis techniques?” and “Are the cost savings of a given method offset
by the problems it fails to identify?”

Comparisons of usability techniques are, however, sometimes difficult to interpret. Many
methods are still only loosely defined. Overlaps across techniques are common. Within each
technique, interpretations of the methodology can vary. One study (Jeffries et al., 1991)
used a heuristic evaluation differently than it was first described by Nielsen in 1990:
Jeffries et al. (1991) used 62 guidelines, whereas Nielsen used only 9. Another study
(Virzi et al., 1993) used several “flavors” of the heuristic evaluation for comparative
purposes. In one study (Karat et al., 1992), the researchers stated that the differences
between experimental testing and the walk-through sessions lay primarily in the amount of
data that were collected and the amount of involvement that the subjects had with the
experimenters. This was done to “test the resource requirements of [each] method.”

To  attempt  to  clarify  any  interpretations  about  the  usability  techniques  typically  used, 
descriptions  of  each  technique  and  variations  associated  with  it  are  provided  in  Table  1.  First,  we 
begin  with  the  experimental  technique.  Using  the  experimental  method,  subjects  are  asked  to 
perform  tasks  using  a  computer  interface,  and  subjects’  interactions  with  the  interface  are  then 
recorded.  Although  many  data  collection  metrics  have  been  developed  and  used  for  the 
experimental  technique,  they  all  generally  fall  into  two  categories:  time  and  errors.  Subjects’ 
interactions  with  the  interface  are  almost  always  “task  based”  and  not  “free  form.”  Subjects 
typically  are  not  encouraged  to  make  interface  suggestions  during  the  session. 

The second usability analysis technique is the individual walk-through, or think-aloud
technique. This procedure involves allowing a subject to interact with an interface while
being encouraged to vocalize any problems encountered with the computer interface. This
technique might be a “task-based” interaction or a more “free form” interaction in which no
tasks are given. This method can sometimes be augmented with usability guideline
information, which is given to the subject to help him or her identify problems. If usability
guidelines are given, the technique is usually then referred to as a heuristic evaluation. Data
collection is usually in the form of comments and suggestions and not in the “time and errors”
format that characterizes the experimental usability techniques. Participants can vary in
experience; however, Desurvire, Lawrence, and Atwood (1991) and Nielsen (1992) have shown
that for a heuristic evaluation, human factors professionals give better results than
non-human factors evaluators.


Table 1

Characteristics of Usability Techniques

                  Type of             Data          Usability           Usability
                  interaction         collection    guidelines          experience

Experimental      Task                Both          Not given           Mixture

Individual
  Heuristic       Usually free form   Subjective    Given               Human factors
  Walk-through    Task or free form   Subjective    Not given           Mixture

Group
  Pluralistic     Task or free form   Subjective    Usually not given   Wide mixture
  Walk-through    Task or free form   Subjective    Usually not given   Mixture

The final usability technique is a group evaluation in which evaluators are brought
together and encouraged to talk about interface problems that they identify collectively.
This may be called a “cognitive walk-through”; more recently, a faster paced version has been
named a “cognitive jog through” (Rowley & Rhoades, 1992). The group may or may not be given
a set of usability guidelines, and it may be encouraged to work with the interface in either
a task-based scenario or a more “free form” scenario. Participants’ professional experience
and background can vary. In fact, one researcher (Bias, 1991) proposes the pluralistic
methodology, which uses a group with the widest range of experience possible.

Using this simple categorization scheme for usability methods, a comparison of the cost
and effectiveness of the various methods is easier but by no means completely clear. The
desire to use one technique instead of another is driven by cost and effectiveness concerns.
However, the literature is unclear about which variant of which technique (experimental,
individual, or group walk-throughs) is the “best.”

Karat  et  al.  (1992)  found  that  in  comparison  to  the  individual  and  group  walk-throughs, 
the  experimental  method  identified  the  largest  number  of  problems  and  identified  problems 
missed  by  the  other  two  techniques.  Cost  analysis  also  showed  that  the  experimental  usability 
technique  used  the  same  or  less  time  to  identify  each  problem.  As  mentioned  previously,  the 
experimental  and  individual  walk-throughs  differed  only  in  the  amount  of  data  collected  and  the 
amount  of  involvement  the  subjects  had  with  the  experimenters. 

Contrary to the findings of Karat et al. (1992), Jeffries et al. (1991) reported that
heuristic evaluations found the most problems with the lowest cost. However, Jeffries et al.
used user interface (UI) specialists who were “members of a research group in human-computer
interaction, [and] had backgrounds in behavioral science as well as experience providing
usability feedback to product groups.” In contrast, Karat et al. used “predominantly end
users and developers of graphic user interface (GUI) systems, along with a few UI specialists
and software support staff.”

In another study, Virzi et al. (1993) found that of three usability techniques (heuristic,
think-aloud [or individual walk-through], and experimental), each was “roughly equivalent in
their ability to detect a core set of usability problems on a per-evaluator basis. However,
the heuristic and think-aloud evaluations were generally more sensitive, uncovering a broader
array of problems in the user interface.” Again, as in the Jeffries et al. (1991) study, the
“heuristic evaluation [was] conducted by in-house usability experts.” Thus, taken altogether,
an understanding of the cost-effectiveness of each method must include not only an
understanding of the method but also of the subjects or evaluators, the type of information
yielded by the method, and the actual resources involved in using the method.

OBJECTIVES 

The goal of this study was twofold: first, to select one of two different interface
design prototypes for a fairly complex analysis tool and to continue refining the selected
design; second, to compare usability analysis techniques, with an emphasis on using the
techniques in sequence as a continuing process. Techniques were selected to cover the range
of currently employed techniques, and a comparison was done to confirm any perceived
strengths or weaknesses. The three usability methods were (1) an experimental evaluation,
(2) an individual heuristic walk-through with usability guidelines, and (3) a group
walk-through.
SUBJECTS


Twenty  subjects  participated  in  the  experimental  evaluation,  all  of  whom  were 
employees  of  the  U.S.  Army  Research  Laboratory  (ARL).  Of  those  20  subjects,  10  participated 
in  the  heuristic  evaluation  and  10  participated  in  the  group  walk-through  evaluation.  The  subjects 
had  various  educational  and  professional  backgrounds,  but  all  these  subjects  were  equal  in 
experience  with  the  tasks  to  be  performed  with  the  software.  They  had  each  received  a  3-day 
training  course  on  the  predecessor  DOS-based  software  Hardware  versus  Manpower 
(HARDMAN  III)  but  had  not  used  the  software  since  the  course. 

MATERIALS  AND  EQUIPMENT 

We  developed  our  process  and  conducted  our  usability  evaluations  during  the  design  of 
the  U.S.  Army  computer  program  entitled  IMPRINT.  Two  IMPRINT  prototypes  were
developed.  The  program  was  designed  to  run  under  the  Windows™  operating  system. 
Prototyping  of  the  program  was  done  using  the  ToolBook™  development  environment,  which 
also  runs  under  the  Windows™  operating  system.  IMPRINT  is  the  Windows™  version  of  a 
DOS™  program  originally  named  HARDMAN  III.  Thus,  HARDMAN  III  provided  much  of  the 
groundwork  for  the  conceptual  design  of  IMPRINT.  HARDMAN  III  is  a  very  complex  task 
network  sequencing  program,  and  consequently,  the  two  IMPRINT  prototypes  were  very 
complex  as  well.  Both  prototypes  mimicked  the  functionality  of  the  final  program.  The 
interactive  prototype  we  developed  was  either  used  directly  on  a  computer  or  was  displayed  on 
a  large  screen  television.  Twenty  subjects  were  each  tested  individually  using  the  same  Gateway
2000  33-MHz  computer  with  a  color  video  graphics  array  (VGA)  monitor.  Data  during  the 
experimental  section  were  collected  by  use  of  a  video  camera  and  by  the  computer  the  subjects 
were  using  during  the  experiment.  The  computer  recorded  when  each  task  started  and  when  each 
task  was  completed,  as  well  as  each  mouse  click  in  between  the  start  and  end  times. 
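
The logging described above could be represented roughly as follows. This is a minimal sketch, not the instrumentation actually used in the ToolBook prototypes: the `TaskLog` class and its field names are hypothetical, and timestamps are assumed to be in seconds from the start of the session.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskLog:
    """One subject's record for one task (hypothetical structure)."""
    subject_id: int
    prototype: str                 # "A" or "B"
    task_id: int
    start_time: float              # seconds since session start (assumed unit)
    end_time: float
    click_times: List[float] = field(default_factory=list)

    def task_time(self) -> float:
        """Elapsed time between task start and task completion."""
        return self.end_time - self.start_time

    def clicks_during_task(self) -> int:
        """Number of mouse clicks recorded between the start and end times."""
        return sum(self.start_time <= t <= self.end_time for t in self.click_times)


# Illustrative values only, not data from the study.
log = TaskLog(subject_id=1, prototype="A", task_id=3,
              start_time=10.0, end_time=88.5,
              click_times=[12.1, 30.4, 61.0, 90.2])
print(log.task_time())           # 78.5
print(log.clicks_during_task())  # 3
```

Keeping the raw click timestamps, rather than only a count, would allow the substeps taken within a task to be reconstructed later.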


PROCEDURE 

The  interactive  screen  prototypes  were  presented  in  a  counterbalanced  scheme  so  that 
the  time  and  errors  for  each  could  be  compared.  Although  all  subjects  had  received  a 
HARDMAN  III  training  class  some  months  before  the  experimentation,  they  received  refresher 
training  immediately  before  the  experiment.  Subjects  had  to  successfully  complete  five  training 
tasks  before  proceeding  with  the  experiment.  The  experiment  consisted  of  two  sets  of  ten  tasks 
that  would  be  performed  using  the  software.  (Ten  subjects  received  one  set  of  ten  tasks;  ten 
subjects  received  the  other  set  of  ten  tasks.)  The  set  of  ten  tasks  was  presented  in  a  different 
random  order  for  each  subject.  Subjects  were  not  told  to  work  as  fast  as  they  could  or  to  make
as  few  errors  as  they  could.  They  were  told  that  they  were  being  recorded  by  the  computer  and 
to  complete  each  task  to  the  best  of  their  ability. 
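
One way to generate such a schedule is sketched below. The report does not describe the exact counterbalancing scheme, so the details here are assumptions: subjects alternate between the two task sets, the order in which the two prototypes are seen alternates within each task-set group, and each subject receives an independently shuffled task order. The function name `build_schedule` and the dictionary keys are hypothetical.

```python
import random


def build_schedule(subject_ids, task_sets, seed=0):
    """Return {subject_id: plan} with a task set, prototype order, and task order.

    Assumptions (not stated in the study): subjects alternate between task
    sets, and prototype order alternates within each task-set group.
    """
    rng = random.Random(seed)      # seeded so the schedule is reproducible
    n_sets = len(task_sets)
    schedule = {}
    for i, sid in enumerate(subject_ids):
        set_index = i % n_sets
        tasks = list(task_sets[set_index])
        rng.shuffle(tasks)         # different random task order per subject
        first_a = (i // n_sets) % 2 == 0
        schedule[sid] = {
            "task_set": set_index + 1,
            "prototype_order": ("A", "B") if first_a else ("B", "A"),
            "task_order": tasks,
        }
    return schedule


task_sets = [list(range(1, 11)), list(range(11, 21))]   # two sets of ten tasks
schedule = build_schedule(range(1, 21), task_sets)      # twenty subjects
```

With twenty subjects and two task sets, this assignment gives ten subjects per task set and, within each set, five subjects who see each prototype first.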

The  heuristic  evaluation  was  conducted  immediately  after  the  experimental  section.  Of 
the  20  subjects  who  participated  in  the  experimental  evaluation,  10  were  randomly  selected  for 
the  heuristic  evaluation  and  given  the  set  of  usability  guidelines  shown  in  Table  2  (Nielsen  & 
Molich,  1990).  Subjects  were  then  instructed  to  use  the  guidelines  to  identify  usability  problems 
with  each  interface. 


Table  2 

Usability  Guidelines 


Simple  and  natural  dialogue 
Speak  the  user’s  language 
Minimize  user  memory  load 
Be  consistent 
Provide  feedback 
Provide  clearly  marked  exits 
Provide  shortcuts 
Good  error  messages 
Prevent  errors 


Subjects  were  told  to  take  as  much  time  as  they  needed.  Subjects  could  choose  to  use  the 
computer  on-line  versions  of  the  prototypes  or  be  given  a  printout  of  each  screen  from  which  to 
work.  Many  subjects  wrote  their  comments  directly  onto  the  heuristic  guideline  sheet  that  was 
given  to  them. 

Finally, the group walk-through used the remaining ten subjects from the previously
conducted experimental evaluation. Subjects met in one room facing a large screen monitor
displaying the prototype. One experimenter served as the moderator for the session. The
session was “task based” in that the same tasks that were used previously for the
experimental section were used again for the group walk-through. Task lists were given to
each of the subjects, and then each task was presented for evaluation with the interface.
Subjects vocalized any concerns they had with the interface while each task was being
exercised. Data were collected by using a video camera and by a second experimenter taking
notes during the entire session.

RESULTS 

As  Figure  1  illustrates,  the  experimental  evaluation  identified  more  problems  than  did  the 
heuristic  or  group  evaluation  techniques.  The  experimental  evaluation  technique  identified  a  total 
of  15  problems. 

Severity ratings for each problem identified were assigned using a three-point scale (high,
medium, and low), which was based on a subset of the Problem Severity Classification (PSC)
ratings used by Karat et al. (1992). The subset we used was the impact of the usability
problem on the end user’s ability to complete the task. Two human factors experts conducted
the severity ratings. Each expert rated the problems independently; then the ratings were
compared for differences. If there were any disagreements, discussion ensued until a
consensus was reached. Figure 2 shows the severity ratings for the problems found with each
usability technique. As Figure 2 indicates, the experimental method identified the greatest
number of high-severity problems, a total of six.
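
The rating procedure can be sketched as a small bookkeeping step: compare the two independent ratings, list the problems that need discussion, and tally the consensus ratings by severity level. The function names, problem identifiers, and ratings below are invented for illustration and are not data from the study.

```python
from collections import Counter

LEVELS = ("high", "medium", "low")


def find_disagreements(rater_1, rater_2):
    """Return problem IDs whose independent ratings differ and need discussion."""
    return sorted(p for p in rater_1 if rater_1[p] != rater_2[p])


def tally(consensus):
    """Count problems at each severity level, in high/medium/low order."""
    counts = Counter(consensus.values())
    return {level: counts.get(level, 0) for level in LEVELS}


# Illustrative ratings only; problem IDs and values are invented.
rater_1 = {"P1": "high", "P2": "medium", "P3": "low"}
rater_2 = {"P1": "high", "P2": "high", "P3": "low"}

to_discuss = find_disagreements(rater_1, rater_2)   # ["P2"]
consensus = dict(rater_1, P2="high")                # outcome of the discussion (assumed)
print(to_discuss, tally(consensus))
```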


Figure  1.  Number  of  problems  identified.  (Categories  shown:  Empirical,  Heuristic,  Group  Walk-through.)


Figure  2.  Problem  severity  identification. 


Table 3

Results

                              Prototype A      Prototype B

Task Set 1 Mean Task Time     78.66 seconds    114.83 seconds
Task Set 2 Mean Task Time     80.50 seconds    81.10 seconds
Task Set 1 Error Scores       3.91 substeps    7.49 substeps
Task Set 2 Error Scores       4.15 substeps    3.08 substeps

During the experimental evaluation, the two prototypes, here labeled Prototype A and
Prototype B, were evaluated for the time and errors obtained with each prototype during each
set of ten tasks. We found that for one group of ten tasks, Prototype A had significantly
lower time and error scores than did Prototype B. However, for the other group of ten tasks,
time and error scores were not significantly different between the prototypes. As shown in
Table 3, for the first group of ten tasks, the average task time was 78.66 seconds for
Prototype A and 114.83 seconds for Prototype B. However, for the second group of ten tasks,
the average task time was 80.5 seconds for Prototype A and 81.1 seconds for Prototype B. An
error score for each of the two prototypes was also calculated by subtracting the “ideal” or
“perfect” number of substeps from the actual number of substeps. For the first group of ten
tasks, the error scores were 3.91 for Prototype A and 7.49 for Prototype B. For the second
group of ten tasks, error scores were 4.15 for Prototype A and 3.08 for Prototype B. A 2
(prototypes) x 10 (tasks) repeated measures analysis of variance (ANOVA) of the time data was
conducted for each group of 10 subjects. Results for the first group indicated a significant
main effect of prototype, F(1,9) = 14.39, p < .01, as well as task, F(9,81) = 14.85, p < .01.
The Prototype x Task interaction was also significant, F(9,81) = 5.15, p < .05. The second
group showed no main effect of prototype, F(9,81) = .42, p > .05; however, it did show a
significant effect of task, F(9,81) = 5.70, p < .01, as well as a Prototype x Task
interaction, F(9,81) = 3.31, p < .01. The error data showed a significant main effect of
prototype for both sets of ten tasks, F(1,9) = 8.33, p < .01 and F(1,9) = 7.44, p < .05, but
no effect for task, F(9,81) = 7.24, p > .05, F(1,9) = 5.03, p > .05.
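
The error score and the prototype main effect can be illustrated with a short sketch. It is not the analysis software used in the study, which the report does not name, and the function names are hypothetical. It relies on a standard identity: for a within-subject factor with only two levels, the repeated measures F ratio for that factor equals the square of the paired t statistic computed on each subject's scores averaged over tasks. With ten subjects this yields the F(1,9) form reported above. The task effect and the Prototype x Task interaction require the full ANOVA and are not covered here.

```python
from math import sqrt
from statistics import mean, stdev


def error_score(actual_substeps, ideal_substeps):
    """Extra substeps taken beyond the ideal path for a task."""
    return actual_substeps - ideal_substeps


def prototype_main_effect(scores_a, scores_b):
    """F ratio and degrees of freedom for a two-level within-subject factor.

    scores_a[i][j] is subject i's score on task j with Prototype A;
    scores_b has the same shape for Prototype B.
    """
    # Average over tasks, then take each subject's A-minus-B difference.
    diffs = [mean(a) - mean(b) for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))   # paired t statistic
    return t * t, (1, n - 1)                     # F(1, n-1) equals t squared


# Illustrative data only: three subjects, two tasks each.
scores_a = [[1, 1], [2, 2], [3, 3]]
scores_b = [[0, 0], [0, 0], [0, 0]]
f_ratio, df = prototype_main_effect(scores_a, scores_b)
```

The same function applies to either task times or error scores, since both are recorded per subject and per task.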

CONCLUSION 

The usability analysis process should be a combination of usability analysis techniques,
each of which has its own advantages and disadvantages. Together, however, the techniques
can complement one another and can be more powerful than if used separately; in other words,
a Gestalt analysis. For this study, no single technique was favored over another; rather,
all techniques were viewed as a process. This makes sense, since the very nature of computer
interface design is in itself an extended process. Usability testing should not be looked at
as a static, one-time expenditure but as an evolving process. This process should encompass
the best aspects of each technique.

We used the experimental method with the hope of finding the most severe errors. As our
results indicated, the most severe errors were identified by the experimental analysis
technique. Also, because of the unique nature of the experimental method, it should be used
in any evaluation process. Not only does it identify many severe errors, but as noted by
Jeffries et al. (1991), it also has the advantage of identifying errors that might never
have been found by the other methodologies.

We  would  also  like  to  point  out  that  task  selection  is  critical  to  an  effective  experimental 
evaluation.  We  found  that  different  sets  of  tasks  produced  statistically  different  sets  of  results 
for  time  and  error  data.  Task  selection  for  experimental  evaluations  has  been  characterized  as  a 
problem  similar  to  the  content  validity  issue  as  described  by  Nunnally  (Lewis,  1994;  Nunnally, 
1978).  This  area  still  warrants  further  research. 

The  experimental  evaluation  provided  much  of  the  information  needed  to  satisfy  our  first 
goal,  which  was  the  selection  of  the  best  prototype  design.  Fortunately,  for  one  group  of  ten 
tasks,  there  was  a  significant  difference  at  the  .05  level  for  time  and  error  scores.  Since  the  second
group  of  ten  tasks  did  not  produce  a  significant  difference  in  time  and  error  scores  at  the  .05  level
for  either  prototype,  the  data  for  the  first  group  of  ten  tasks  gave  us  the  best  indication  of  which
was  the  better  interface  prototype  design. 

Next, the heuristic evaluation was given after the experimental section, in the hope that
subjects would draw from their experimental evaluation experiences and be more likely to
report severe errors. Based on the data we collected, our assumptions were fairly accurate.
Perhaps more importantly, it appeared that following the experimental evaluation with the
heuristic evaluation produced fewer low-priority errors.

The group walk-through evaluation also used subjects who had previously participated in
the experimental evaluation, in the hope that their input would be based on the experience
they had gained during the experimental evaluations. However, because of logistical
problems, the meeting was not held soon enough after the experimental evaluations, and
subjects spent much of the evaluation session trying to remember what they had done during
the experimental evaluations. The group evaluation did, however, produce a large number of
severe errors, second only to the experimental method.

The  idea  of  viewing  computer  interface  usability  testing  as  a  Gestalt  analysis,  instead  of  a 
single  technique  or  methodology,  is  an  attractive  one.  The  literature  indicates  that  some 
techniques  may  be  more  effective  than  others  in  identifying  certain  types  of  problems  and  that 
each  technique  might  complement  the  others  in  finding  all  types  of  usability  problems.  Further 
research  needs  to  be  done  to  help  clarify  this  area  as  well  as  to  identify  the  best  order  in  which  to 
use  each  methodology  in  an  overall  usability  process. 